Cooperative Hierarchical Dirichlet Processes: Superposition vs. Maximization

Authors

  • Junyu Xuan
  • Jie Lu
  • Guangquan Zhang
  • Richard Y. D. Xu
Abstract

The cooperative hierarchical structure is a common and significant data structure observed in, or adopted by, many research areas, such as text mining (author-paper-word) and multi-label classification (label-instance-feature). Renowned Bayesian approaches for cooperative hierarchical structure modeling are mostly based on topic models. However, these approaches suffer from a serious issue: the number of hidden topics/factors needs to be fixed in advance, and an inappropriate number may lead to overfitting or underfitting. One elegant way to resolve this issue is Bayesian nonparametric learning, but existing work in this area still cannot be applied to cooperative hierarchical structure modeling. In this paper, we propose a cooperative hierarchical Dirichlet process (CHDP) to fill this gap. Each node in a cooperative hierarchical structure is assigned a Dirichlet process to model its weights on the infinite hidden factors/topics. Together with measure inheritance from the hierarchical Dirichlet process, two kinds of measure cooperation, i.e., superposition and maximization, are defined to capture the many-to-many relationships in the cooperative hierarchical structure. Furthermore, two constructive representations for CHDP, i.e., stick-breaking and international restaurant process, are designed to facilitate model inference. Experiments on synthetic and real-world data with cooperative hierarchical structures demonstrate the properties of CHDP, its ability to model cooperative hierarchical structures, and its potential for practical application scenarios.
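
To make the two cooperation operations named in the abstract concrete, the following Python sketch combines two finite discrete measures either by summing their atom weights (superposition) or by taking the larger weight per atom (maximization), then renormalizing. This is only an illustrative approximation under assumed definitions: the function names, the renormalization step, and the toy topic weights are not taken from the paper, whose operations act on infinite random measures drawn from Dirichlet processes.

    def superposition(mu1, mu2):
        # Add the weights of shared atoms, then renormalize to a probability measure.
        atoms = set(mu1) | set(mu2)
        combined = {a: mu1.get(a, 0.0) + mu2.get(a, 0.0) for a in atoms}
        total = sum(combined.values())
        return {a: w / total for a, w in combined.items()}

    def maximization(mu1, mu2):
        # Keep the larger of the two weights for each atom, then renormalize.
        atoms = set(mu1) | set(mu2)
        combined = {a: max(mu1.get(a, 0.0), mu2.get(a, 0.0)) for a in atoms}
        total = sum(combined.values())
        return {a: w / total for a, w in combined.items()}

    # Toy example: two parent measures over a shared set of topics.
    mu_a = {"topic1": 0.7, "topic2": 0.3}
    mu_b = {"topic2": 0.4, "topic3": 0.6}
    print(superposition(mu_a, mu_b))   # weights are pooled across parents
    print(maximization(mu_a, mu_b))    # the larger parent weight wins per topic

Under this reading, a child node's measure would be built from the measures of all of its parent nodes via one of these two rules, which is how the many-to-many relationships in the hierarchy are captured; the paper's stick-breaking and restaurant-process constructions realize this in the infinite-dimensional setting.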


Related resources

Dependent Hierarchical Normalized Random Measures for Dynamic Topic Modeling

We develop dependent hierarchical normalized random measures and apply them to dynamic topic modeling. The dependency arises via superposition, subsampling and point transition on the underlying Poisson processes of these measures. The measures used include normalised generalised Gamma processes that demonstrate power law properties, unlike Dirichlet processes used previously in dynamic topic m...


Time-Varying Topic Models using Dependent Dirichlet Processes

We lay the ground for extending Dirichlet-process-based clustering and factor models to explicitly include variability as a function of time (or other known covariates) by integrating Dependent Dirichlet Processes into existing hierarchical topic models. ...


Online Data Clustering Using Variational Learning of a Hierarchical Dirichlet Process Mixture of Dirichlet Distributions

This paper proposes an online clustering approach based on both hierarchical Dirichlet processes and Dirichlet distributions. The use of hierarchical Dirichlet processes resolves the model-selection difficulties that arise when the number of mixture components is unknown, thanks to their nonparametric nature. The consideration of the Dirichlet distribution is justified b...


Hierarchical Dirichlet Processes

We consider problems involving groups of data, where each observation within a group is a draw from a mixture model, and where it is desirable to share mixture components between groups. We assume that the number of mixture components is unknown a priori and is to be inferred from the data. In this setting it is natural to consider sets of Dirichlet processes, one for each group, where the well...


Hierarchical Double Dirichlet Process Mixture of Gaussian Processes

We consider an infinite mixture model of Gaussian processes that share mixture components between nonlocal clusters in data. Meeds and Osindero (2006) use a single Dirichlet process prior to specify a mixture of Gaussian processes using an infinite number of experts. In this paper, we extend this approach to allow for experts to be shared non-locally across the input domain. This is accomplishe...



Journal:
  • CoRR

Volume: abs/1707.05420  Issue: 

Pages: -

Publication year: 2017